Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction
Abstract
Paraphrases and paraphrasing algorithms have proven important in a variety of natural language processing tasks. While most paraphrase extraction approaches extract equivalent sentences, sentences are an inconvenient unit for further processing because they are too specific and are often not exact paraphrases. Paraphrase fragment extraction post-processes sentential paraphrases and prunes them to more convenient phrase-level units. We present a new approach that uses semantic roles to extract paraphrase fragments from sentence pairs that share semantic content to varying degrees, including full paraphrases. In contrast to previous systems, the use of semantic parses allows for extracting paraphrases with high wording variance and different syntactic categories. The approach is tested on four different input corpora and compared to two previous systems for extracting paraphrase fragments. Our system finds three times as many good paraphrase fragments per sentence pair as the baselines, and at the same time outputs 30% fewer unrelated fragment pairs.
Similar resources
Using Predicate-Argument Structures for Information Extraction
In this paper we present a novel, customizable IE paradigm that takes advantage of predicate-argument structures. We also introduce a new way of automatically identifying predicate argument structures, which is central to our IE paradigm. It is based on: (1) an extended set of features; and (2) inductive decision tree learning. The experimental results prove our claim that accurate predicate-ar...
Using Repeated Patterns across Comparable Articles for Paraphrase Acquisition
We focus on paraphrases for information extraction: expressions which should produce the same extraction output. These expressions are acquired automatically from comparable news articles (articles from the same day, on the same topic). Candidate paraphrases are paths in predicate argument structure starting from matching anchors (typically, names) in the two sentences. By using such syntactica...
Utilizing Automatic Predicate-Argument Analysis for Concept Map Mining
Concept maps can be used to provide concise and structured summaries of documents. Motivated by their usefulness in many application scenarios, several approaches have been suggested for concept map mining, the automatic extraction of concept maps from text. However, a major bottleneck of previous work is the common pattern-based approach used to extract concepts and relations from documents wh...
Recognizing Textual Relatedness with Predicate-Argument Structures
In this paper, we first compare several strategies to handle the newly proposed three-way Recognizing Textual Entailment (RTE) task. Then we define a new measurement for a pair of texts, called Textual Relatedness, which is a weaker concept than semantic similarity or paraphrase. We show that an alignment model based on the predicate-argument structures using this measurement can help an RTE sy...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...